AIbase

# Autoregressive Multimodal

Vila U 7b 256
MIT
VILA-U is a foundation model that unifies vision-language understanding and generation tasks, achieving efficient multimodal processing through a single autoregressive framework.
Text-to-Image
mit-han-lab
127
21
Janus 1.3B
MIT
Janus is a novel autoregressive framework that unifies multimodal understanding and generation. By decoupling visual encoding, it addresses the limitations of previous methods and enhances the flexibility of the framework.
Text-to-Image, Transformers
deepseek-ai
12.44k
588
Anole 7b V0.1 Hf
Apache-2.0
Anole is an open-source autoregressive multimodal model capable of generating interleaved image-text sequences without relying on Stable Diffusion.
Text-to-Image, Transformers, English
leloy
22.83k
8
© 2025 AIbase